Boosting Open Information Extraction with Noun-Based Relations

نویسندگان

  • Clarissa Castellã Xavier
  • Vera Lúcia Strube de Lima
چکیده

Open Information Extraction (Open IE) is a strategy for learning relations from texts, regardless the domain and without predefining these relations. Work in this area has focused mainly on verbal relations. In order to extend Open IE to extract relationships that are not expressed by verbs, we present a novel Open IE approach that extracts relations expressed in noun compounds (NCs), such as (oil, extracted from, olive) from “olive oil”, or in adjective-noun pairs (ANs), such as (moon, that is, gorgeous) from “gorgeous moon”. The approach consists of three steps: detection of NCs and ANs, interpretation of these compounds in view of corpus enrichment and extraction of relations from the enriched corpus. To confirm the feasibility of this method we created a prototype and evaluated the impact of the application of our proposal in two state-of-the-art Open IE extractors. Based on these tests we conclude that the proposed approach is an important step to fulfil the gap concerning the extraction of relations within the noun compounds and adjective-noun pairs in Open IE.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

متن کامل

Open Information Extraction with Tree Kernels

Traditional relation extraction seeks to identify pre-specified semantic relations within natural language text, while open Information Extraction (Open IE) takes a more general approach, and looks for a variety of relations without restriction to a fixed relation set. With this generalization comes the question, what is a relation? For example, should the more general task be restricted to rel...

متن کامل

Using LSA and Noun Coordination Information to Improve the Precision and Recall of Automatic Hyponymy Extraction

In this paper we demonstrate methods of improving both the recall and the precision of automatic methods for extraction of hyponymy (IS A) relations from free text. By applying latent semantic analysis (LSA) to filter extracted hyponymy relations we reduce the rate of error of our initial pattern-based hyponymy extraction by 30%, achieving precision of 58%. Applying a graph-based model of noun-...

متن کامل

Using LSA and Noun Coordination Information to Improve the Recall and Precision of Automatic Hyponymy Extraction

In this paper we demonstrate methods of improving both the recall and the precision of automatic methods for extraction of hyponymy (IS A) relations from free text. By applying latent semantic analysis (LSA) to filter extracted hyponymy relations we reduce the rate of error of our initial pattern-based hyponymy extraction by 30%, achieving precision of 58%. Applying a graph-based model of noun-...

متن کامل

ReNoun: Fact Extraction for Nominal Attributes

Search engines are increasingly relying on large knowledge bases of facts to provide direct answers to users’ queries. However, the construction of these knowledge bases is largely manual and does not scale to the long and heavy tail of facts. Open information extraction tries to address this challenge, but typically assumes that facts are expressed with verb phrases, and therefore has had diff...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014